
CLIR Overview


Introduction

Conventional search engines find relevant information mainly from a statistical point of view. Such an approach, however, introduces a lot of "noisy data", that is, data that a user will never search for. Certain attempts to use linguistic knowledge have been successfully implemented in the past, e.g. stop-word lists and stemming for different languages. ATLAS exploits the fusion between high-performance, statistically based search engines and semantic technologies by integrating Nebula5, a state-of-the-art cross-lingual information retrieval (CLIR) system (http://www.rrz.uni-hamburg.de/index.php?id=1392).
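The two linguistic techniques mentioned above, stop-word removal and stemming, can be illustrated with a minimal sketch. It is not taken from ATLAS or Nebula5: the stop-word list, the suffix list and the function names are assumptions made for illustration, and the stemmer is a crude suffix stripper rather than an established algorithm such as Porter's.

```python
import re
from typing import List

# Illustrative English stop-word list (assumption); real systems use
# larger, language-specific lists.
STOP_WORDS = {"the", "a", "an", "of", "for", "and", "in", "to", "is"}

# Hypothetical suffix list for a crude stemmer; this is NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def normalize(text: str) -> List[str]:
    """Lowercase, tokenize, drop stop-words, then stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


print(normalize("Searching for the indexed documents"))
```

Applying the same normalization to both documents and queries lets "searching" and "search" match, while frequent function words no longer contribute to the index.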

Nebula5

The Nebula5 system is a fairly generic framework that supports building information management applications. Current search engine technology is mostly based on (lexico-syntactic) open web search, which in turn is based on common information retrieval techniques. These provide only basic tools, which are not very effective in an environment that is highly socialized and fine-grained in terms of information. Semantic Web methods, on the other hand, are well suited to shaping indexed knowledge according to the actual informational situation and needs of the members of an institution: providing semantically rich, machine-readable information about resources and the principle of distributed extensibility are key aspects of Semantic Web theory. These thoughts led us to an approach that combines enterprise search with Semantic Web technology. While this is not a novel idea in general, we add some new aspects to it, expecting to overcome the difficulties described above. We propose that fusing search engine and Semantic Web technology at the right level, i.e. enabling semantic annotations and distributed extensibility within the institution while maintaining free-text search functionality, will create synergy that can raise the effectiveness of a semantic search approach in an institutional (enterprise) environment.
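The idea of keeping free-text search while adding semantic annotations can be sketched as follows. This is only an illustration under stated assumptions, not the Nebula5 API: the `Document` class, the `search` function, the annotation schema (property name mapped to a set of values) and the term-count scoring are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Document:
    doc_id: str
    text: str
    # Semantic annotations as property -> set of values (hypothetical schema).
    annotations: Dict[str, Set[str]] = field(default_factory=dict)


def search(docs: List[Document], query: str,
           required: Optional[Dict[str, str]] = None) -> List[str]:
    """Rank documents by raw term count, restricted by annotation constraints."""
    terms = set(query.lower().split())
    required = required or {}
    hits = []
    for doc in docs:
        # Semantic filter: every required property/value pair must be annotated.
        if any(value not in doc.annotations.get(prop, set())
               for prop, value in required.items()):
            continue
        # Free-text part: count occurrences of the query terms in the text.
        words = doc.text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            hits.append((score, doc.doc_id))
    # Highest score first; ties broken by document id for a stable order.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [doc_id for _, doc_id in hits]


docs = [
    Document("d1", "semantic search for enterprise search", {"department": {"library"}}),
    Document("d2", "search engine basics", {"department": {"physics"}}),
    Document("d3", "campus map", {"department": {"library"}}),
]
print(search(docs, "search"))
print(search(docs, "search", {"department": "library"}))
```

Without constraints the query behaves like ordinary free-text search; adding an annotation constraint narrows the result to resources that members of the institution have described accordingly.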